Structured and Unstructured Document Summarization: Design of a Commercial Summarizer using Lexical Chains
نویسندگان
چکیده
The process of summarizing documents is becoming increasingly important in the light of recent advances in document creation/distribution technology, and the resulting influx of large numbers of documents in every day life. This paper presents a document summarizer that combines document analysis, structural decomposition, XML representation and lexical chain analysis. The proposed summarizer is compared to three commercially available summarizers and it is shown that it produces either comparable or better summaries overall.
منابع مشابه
WordNet-based Summarization of Unstructured Document
This paper presents an improved and practical approach to automatically summarizing unstructured document by extracting the most relevant sentences from plain text or html version of original document. This technique proposed is based upon Key Sentences using statistical method and WordNet. Experimental results show that our approach compares favourably to a commercial text summarizer, and some...
متن کاملAn EÆcient Text Summarizer Using Lexical Chains
We present a system which uses lexical chains as an intermediate representation for automatic text summarization. This system builds on previous research by implementing a lexical chain extraction algorithm in linear time. The system is reasonably domain independent and takes as input any text or HTML document. The system outputs a short summary based on the most salient concepts from the origi...
متن کاملText Summarization Using Lexical Chains
Text summarization addresses both the problem of selecting the most important portions of text and the problem of generating coherent summaries. We present in this paper the summarizer of the University of Lethbridge at DUC 2001, which is based on an efficient use of lexical chains.
متن کاملAn Efficient Text Summarizer using Lexical Chains
We present a system which uses lexical chains as an intermediate representation for automatic text summarization. This system builds on previous research by implementing a lexical chain extraction algorithm in linear time. The system is reasonably domain independent and takes as input any text or HTML document. The system outputs a short summary based on the most salient concepts from the origi...
متن کاملCohesion and coherence for Automatic Summarization
This paper presents the integration of cohesive properties of text with coherence relations, to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enchanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated with newspaper corpus, this integration yields only slight improvement in the resulting s...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003